logo

This script explores Three Rivers Park District (TRPD) data on the wild lupine (Lupinus perennis) population at the Crow-Hassan Park Reserve (CHPR). Park managers are evaluating the potential reintroduction of the Karner blue butterfly (Lycaeides melissa samuelis; KBB) to CHPR. Wild lupine is the sole food plant for KBB caterpillars, so a sufficient and stable population of lupine at CHPR is a reasonable prerequisite for butterfly reintroduction. CHPR has monitored vegetation in restored prairies since 2000, allowing us to assess aspects of the plant’s status in the park. I find that stem desnities in 3 large managment units exceed minimum reuqired densities suggested in the literature for KBB reintroduction. Lupine percent cover was also stable at the landscape scale from 2013 to 2015.

Wild lupine (*L. perennis*)

Wild lupine (L. perennis)

Setup

Load packages & data

# Loading required packages for analysis:
library(knitr)
library(dplyr)
library(ggplot2)
library(ggthemes)
library(effects)
library(here)

source(here("Scripts", "1_Data_Import.R"))
source(here("Scripts", "2_Spatial_Data_Import.R"))
source(here("Scripts", "3_Data_Cleaning.R"))
source(here("Scripts", "4_Covariate_Data_Prep.R"))
source(here("Scripts", "5_Prep_Species.R"))

Data manipulation

The only TRPD dataset with lupine records is the the Prairie/Old Field vegetation survey. Here we first we add a new column for survey ID (a unique string combining area, site, and date) to this dataset, then select CHPR surveys, then print the first few rows for example.

veg_raw <- veg.tblDataPrOF04Circles %>%
  mutate(survey_string = paste0(Area, "-", `Site#`, "-", Date)) # unique code for survey based on area, site #, and date
veg_raw <- veg_raw[grep("CHPR", veg_raw$Area, ignore.case = TRUE), ] # select records from CHPR
head(veg_raw)

Now select records of lupine from the dataset and print the first few rows for example:

lupine_raw <- veg_raw %>%
  filter(`Common Name` == "Lupine, wild")
head(lupine_raw)

The species and percent cover of grasses and herbs are recorded in four 1-meter diameter plots placed in the cardinal directions from the plot center. For this initial data exploration I will simply combine data from each of the four plots:

  • total_tally = sum of all 4 quadrats (NESW) for each survey

  • mean_percent_cover = combine the 4 quadrats and recalculate percent cover

  • quads_w_lupine = how many of the 4 quadrats had lupine

lupine <- lupine_raw %>%
  group_by(survey_string, Area, `Site#`, Date) %>%
  summarize(
    total_tally = sum(Tally), # combine tallies from each of four plots
    mean_percent_cover = sum(`% cover`) / 400, # re-calucalute mean cover across all four plots
    quads_w_lupine = n_distinct(NSEW)
  ) # how many of the 4 plots had lupine?
head(lupine)

For some analyses/plots it will be good to know when lupine were NOT observed. This code back-fills zeroes for surveys where lupine was not observed:

# generate list of surveys without any lupine
all_surveys <- veg_raw %>%
  group_by(survey_string, Area, `Site#`, Date) %>%
  count()
all_surveys_vector <- all_surveys$survey_string # vector with all surveys
surveys_with_lupine_vector <- lupine$survey_string # vector of only subset of surveys with lupine
surveys_without_lupine_vector <- setdiff(all_surveys_vector, surveys_with_lupine_vector) # vector of surveys without lupine
# create dataframe for surveys without lupine with zero values for lupine measurements
surveys_without_lupine <- data.frame(
  survey_string = surveys_without_lupine_vector,
  total_tally = 0,
  mean_percent_cover = 0,
  quads_w_lupine = 0
)
surveys_without_lupine <- left_join(surveys_without_lupine, all_surveys) # join back the area, site#, and date fields

lupine <- bind_rows(lupine, surveys_without_lupine)
lupine <- lupine %>%
  mutate(AreaSite = paste0(Area, "-", `Site#`))

Data exploration

With data prepared we can now do some basic exploration. This is all EXTREMELY preliminary.

What are counts of lupine?

Plot basic histograms:

lupine_raw %>%
  filter(Date >= 2016) %>% # no tallies starting in 2016
  ggplot(aes(x = Tally)) +
  geom_histogram(binwidth = 1, fill = "#69b3a2", color = "#e9ecef", alpha = 0.9) +
  theme_clean() +
  labs(
    title = "Lupine counts CHPR",
    subtitle = "All quadrats from 2016 and on included",
    x = "Lupine count",
    y = "Number of quadrats"
  ) +
  theme(
    plot.title = element_text(size = 11, face = "bold"),
    plot.subtitle = element_text(size = 10, face = "italic")
  )

lupine_raw %>%
  filter(Date >= 2016) %>% # no tallies starting in 2016
  ggplot(aes(x = Tally)) +
  facet_wrap(~Area) +
  geom_histogram(binwidth = 1, fill = "#69b3a2", color = "#e9ecef", alpha = 0.9) +
  theme_clean() +
  labs(
    title = "Lupine counts by area",
    subtitle = "All quadrats from 2016 and on included",
    x = "Lupine count",
    y = "Number of quadrats"
  ) +
  theme(
    plot.title = element_text(size = 11, face = "bold"),
    plot.subtitle = element_text(size = 10, face = "italic")
  )

What is maximum density of lupine recorded in surveys?

Tally individuals and sort by max value descending:

lupine_raw %>%
  arrange(desc(Tally)) %>%
  head()

The maximum count of lupine in any quadrat (1 m diameter) is 16, which equals a density of 20.37 individuals/m^2.

Comparing lupine density across sites

Knowing which area/sites have higher/lower density is relevant to reintroduction planning. For analysis of stem density I use data from 2013, when individual plants were counted (later surveys only report percent cover). This code gets lupine tallies in each quadrat in 2013 and backfills zeroes for quadrats where no lupine was detected.

CHPR_2013 <- veg_raw %>%
  mutate(quadrat_string = paste0(Area, "-", `Site#`, "-", NSEW, "-", Date)) %>% # unique string for each quadrat
  filter(Date >= "2013-01-01" & Date < "2014-01-01") # select 2013 surveys

CHPR_2013_quadrats_all <- CHPR_2013 %>%
  group_by(quadrat_string, Area, `Site#`, NSEW, Date) %>%
  count()

CHPR_2013_quadrats_all_vector <- unique(CHPR_2013$quadrat_string)

CHPR_2013_quadrats_with_lupine <- CHPR_2013 %>%
  filter(`Common Name` == "Lupine, wild")

CHPR_2013_quadrats_with_lupine_vector <- unique(CHPR_2013_quadrats_with_lupine$quadrat_string)

CHPR_2013_quadrats_without_lupine_vector <- setdiff(CHPR_2013_quadrats_all_vector, CHPR_2013_quadrats_with_lupine_vector) # vector of surveys without lupine

# create dataframe for surveys without lupine with zero values for lupine measurements
CHPR_2013_quadrats_without_lupine <- data.frame(quadrat_string = CHPR_2013_quadrats_without_lupine_vector, Tally = 0)

CHPR_2013_quadrats_without_lupine <- left_join(CHPR_2013_quadrats_without_lupine, CHPR_2013_quadrats_all) # join back the area, site#, and date fields

CHPR_2013_lupine <- bind_rows(CHPR_2013_quadrats_with_lupine, CHPR_2013_quadrats_without_lupine)

CHPR_2013_lupine <- CHPR_2013_lupine %>%
  mutate(AreaSite = paste0(Area, "-", `Site#`))

Map sites with lupine:

Calculate density by area:

# comparing the mean() and var() of the counts
# mean(CHPR_2013_lupine$Tally) # calculate mean: 0.9
# var(CHPR_2013_lupine$Tally) # calculate variance: 7.3
# poisson model will be overdispersed, so fitting a negative binomial regression for first pass...

# fitting negative binomial model
library(MASS)
fit.nb <- glm.nb(Tally ~ Area - 1, data = CHPR_2013_lupine)
summary(fit.nb)
## 
## Call:
## glm.nb(formula = Tally ~ Area - 1, data = CHPR_2013_lupine, init.theta = 0.1243587739, 
##     link = log)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.91007  -0.77875  -0.40569  -0.00006   2.20305  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## AreaCHPR04    0.3409     0.5230   0.652   0.5145    
## AreaCHPR05    0.2624     0.4693   0.559   0.5761    
## AreaCHPR06   -0.1823     0.5067  -0.360   0.7190    
## AreaCHPR08  -20.3026  4486.5497  -0.005   0.9964    
## AreaCHPR10    1.2090     0.6457   1.872   0.0612 .  
## AreaCHPR14   -2.1484     0.5262  -4.083 4.45e-05 ***
## AreaCHPR21   -2.4849     1.2923  -1.923   0.0545 .  
## AreaCHPR22  -20.3026  4486.5497  -0.005   0.9964    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Negative Binomial(0.1244) family taken to be 1)
## 
##     Null deviance: 128.391  on 224  degrees of freedom
## Residual deviance:  91.197  on 216  degrees of freedom
## AIC: 392.72
## 
## Number of Fisher Scoring iterations: 1
## 
## 
##               Theta:  0.1244 
##           Std. Err.:  0.0279 
## 
##  2 x log-likelihood:  -374.7240

Model coefficients in summary above are all on log scale. So now we exponentiate to get average counts and their standard error at each site in the original units. I then convert the count values to density (individuals per m^2 to compare to literature values):

# round(exp(summary(fit.nb)$coefficients), digits = 2)[,1:3] # tally
round((exp(summary(fit.nb)$coefficients) / 0.7854), digits = 2)[, 1:3] # density
##            Estimate Std. Error z value
## AreaCHPR04     1.79       2.15    2.44
## AreaCHPR05     1.66       2.04    2.23
## AreaCHPR06     1.06       2.11    0.89
## AreaCHPR08     0.00        Inf    1.27
## AreaCHPR10     4.27       2.43    8.28
## AreaCHPR14     0.15       2.15    0.02
## AreaCHPR21     0.11       4.64    0.19
## AreaCHPR22     0.00        Inf    1.27

In 2013, 3 CHPR units (#4, #5, #10) had lupine densities greater than the 1.5 stems/m2 minimum required for reintroduction set by Chan and Packer (2006).

How has the population changed over time?

Plot trends at each site over time:

ggplot(
  data = filter(lupine, Area != "CHPR22" & Area != "CHPR08"),
  aes(x = Date, y = mean_percent_cover, group = AreaSite, color = Area)
) +
  geom_line() +
  geom_jitter() +
  theme_clean() +
  labs(
    title = "Lupine percent cover at CHPR sites",
    subtitle = "Average cover across 4 circular quadrats per site, each 1 m in diameter",
    x = "Date",
    y = "Percent cover"
  ) +
  theme(
    plot.title = element_text(size = 11, face = "bold"),
    plot.subtitle = element_text(size = 10, face = "italic")
  )

Of sites with higher densities historically, measured percent cover between 2013 and 2015 :
- consistently decreased in Area #4
- generally increased in Area #5
- generally increased in Area #10

Zoom in:

ggplot(
  data = filter(lupine, Area != "CHPR22" & Area != "CHPR08" & Date >= "2013-01-01"),
  aes(x = Date, y = mean_percent_cover, group = AreaSite, color = Area)
) +
  geom_line() +
  geom_jitter() +
  theme_clean() +
  labs(
    title = "Lupine percent cover at CHPR sites",
    subtitle = "Average cover across 4 circular quadrats per site, each 1 m in diameter",
    x = "Date",
    y = "Percent cover"
  ) +
  theme(
    plot.title = element_text(size = 11, face = "bold"),
    plot.subtitle = element_text(size = 10, face = "italic")
  )

For every site decreasing there’s at least one increasing over this 5-yr interval.

sessionInfo()
## R version 4.2.2 (2022-10-31 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 22621)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] MASS_7.3-58.1     viridis_0.6.4     viridisLite_0.4.2 ggrepel_0.9.4     kableExtra_1.3.4  sf_1.0-14        
##  [7] rgdal_1.6-4       sp_2.1-2          effects_4.2-2     carData_3.0-5     ggthemes_5.0.0    knitr_1.45       
## [13] readxl_1.4.3      RODBC_1.3-23      auk_0.7.0         lubridate_1.9.3   forcats_1.0.0     stringr_1.5.1    
## [19] dplyr_1.1.4       purrr_1.0.2       readr_2.1.4       tidyr_1.3.0       tibble_3.2.1      ggplot2_3.4.4    
## [25] tidyverse_2.0.0   here_1.0.1       
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-160       bit64_4.0.5        webshot_0.5.5      httr_1.4.7         insight_0.19.7    
##  [6] rprojroot_2.0.4    tools_4.2.2        bslib_0.6.1        utf8_1.2.4         R6_2.5.1          
## [11] KernSmooth_2.23-20 DBI_1.1.3          colorspace_2.1-0   nnet_7.3-18        withr_2.5.2       
## [16] gridExtra_2.3      tidyselect_1.2.0   bit_4.0.5          compiler_4.2.2     rvest_1.0.3       
## [21] cli_3.6.1          xml2_1.3.5         labeling_0.4.3     sass_0.4.7         scales_1.3.0      
## [26] classInt_0.4-10    proxy_0.4-27       systemfonts_1.0.5  digest_0.6.33      minqa_1.2.6       
## [31] svglite_2.1.2      rmarkdown_2.25     pkgconfig_2.0.3    htmltools_0.5.7    lme4_1.1-35.1     
## [36] fastmap_1.1.1      highr_0.10         rlang_1.1.2        rstudioapi_0.15.0  farver_2.1.1      
## [41] jquerylib_0.1.4    generics_0.1.3     jsonlite_1.8.7     vroom_1.6.4        magrittr_2.0.3    
## [46] Matrix_1.6-4       Rcpp_1.0.11        munsell_0.5.0      fansi_1.0.5        lifecycle_1.0.4   
## [51] stringi_1.8.2      yaml_2.3.7         epuRate_0.1        grid_4.2.2         parallel_4.2.2    
## [56] crayon_1.5.2       lattice_0.20-45    splines_4.2.2      hms_1.1.3          pillar_1.9.0      
## [61] boot_1.3-28        glue_1.6.2         evaluate_0.23      mitools_2.4        vctrs_0.6.5       
## [66] nloptr_2.0.3       tzdb_0.4.0         cellranger_1.1.0   gtable_0.3.4       assertthat_0.2.1  
## [71] cachem_1.0.8       xfun_0.41          survey_4.2-1       e1071_1.7-13       class_7.3-20      
## [76] survival_3.4-0     units_0.8-5        timechange_0.2.0
 




By Sam Safran